Git and GitHub for Remote Collaboration

@javirudolph

Objective

  • Main objective: maintaining code for scientific collaboration.
  • Effective ways to store code, track changes to it, and collaborate on it.

Why GitHub?

  • It is the most widely used platform for version control and collaboration.
    • It integrates communication features:
      • Issues
      • Discussions
      • Pages
    • You can engage and collaborate on code, but also publish information to a webpage.

The difference between Git and GitHub

  • Git is the version control system that enables all the collaborative tools available on GitHub.
  • Git launched in 2005.
    • Basic concepts of Git: commit, push, pull, checkout.
  • Git operations are typically run through the terminal.
  • GitHub is web-based: its functionality is available to users less familiar with software development.
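As a rough illustration of the basic concepts above, the sketch below runs commit and checkout in a throwaway local repository. The file name, commit messages, and user identity are placeholders chosen for this example. Push and pull are left out here because they need a remote repository.

```shell
#!/bin/sh
set -e  # stop at the first failing command

# Work in a throwaway directory so no real project is touched.
demo=$(mktemp -d)
cd "$demo"

git init -q .
# Placeholder identity, stored only in this demo repository.
git config user.name "Demo User"
git config user.email "demo@example.com"

# commit: record a snapshot of the staged changes with a message.
echo "x <- 1" > analysis.R
git add analysis.R
git commit -q -m "Add analysis script"

# A second commit; -a stages changes to files Git already tracks.
echo "y <- 2" >> analysis.R
git commit -q -am "Add second variable"

# Show the history, one commit per line.
git log --oneline

# checkout: restore the file as it was one commit ago...
git checkout -q HEAD~1 -- analysis.R
cat analysis.R

# ...and then bring back the latest committed version.
git checkout -q HEAD -- analysis.R
```

Running it in a temporary directory keeps the experiment separate from any real work, which is a reasonable way to practice these commands.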

GitHub, GitLab, and Bitbucket

All three are similar: they provide hosting services, essentially a home for your project on the internet.

It’s like having Dropbox or Google Drive, but for Git-based projects.

This allows other people to see your stuff, synchronize it, and contribute.

Some GitHub features

  • Well-designed user interface

  • Issues: originally a bug tracker, but highly underutilized in our fields

  • R and GitHub integrate well, thanks to the active R package development community.

  • An intro on this can be found here and here

Step-by-step process:

Artwork by @allison_horst

  1. Create a remote repo and sync it with a local directory and its files.
  2. Modify files locally or remotely.
  3. Frequently ‘commit’ the changes, with a description of what changed.
  4. Synchronize commits with GitHub (push and pull).
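The four steps above can be sketched in the terminal. This is a minimal example under stated assumptions: a bare repository on local disk stands in for the GitHub repo (with GitHub you would clone the repository's URL instead), and the names alice, bob, and README.md are hypothetical.

```shell
#!/bin/sh
set -e  # stop at the first failing command

work=$(mktemp -d)
cd "$work"

# Assumption: a bare repository on disk plays the role of the GitHub repo.
git init -q --bare remote.git

# 1. Get a local copy that is linked to the remote.
git clone -q remote.git alice
cd alice
git config user.name "Alice Example"        # placeholder identity
git config user.email "alice@example.com"

# 2. Modify files locally.
echo "# My project" > README.md

# 3. Commit the change with a description.
git add README.md
git commit -q -m "Add README with project title"
git branch -M main   # make sure the branch is called main

# 4a. Push: send local commits to the remote.
git push -q -u origin main

# A collaborator clones the same remote, commits, and pushes.
cd "$work"
git clone -q --branch main remote.git bob
cd bob
git config user.name "Bob Example"          # placeholder identity
git config user.email "bob@example.com"
echo "Contact: bob" >> README.md
git commit -q -am "Add contact line"
git push -q origin main

# 4b. Pull: bring the collaborator's commit into the first copy.
cd "$work/alice"
git pull -q --ff-only origin main
cat README.md
```

Pulling before starting new work and pushing after each set of commits keeps the copies from drifting apart.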

Practical ways to use:

  1. Storage: just because you can version control something doesn’t mean that you should.
  • Good candidates are plain text-based documents. Git can show exactly which lines changed between versions, and it compresses their history so that it takes up very little space.
  • Things not to version control: large data files that never change.
    • If code is fully reproducible, you shouldn’t need to store the output.
  • For long-term ‘storage’, an archive such as Zenodo is a better fit.
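One common way to keep large data and regenerable output out of version control is a .gitignore file. The sketch below assumes a hypothetical layout (data/raw/ for large files, output/ for generated results, R/ for code). The file names and contents are placeholders.

```shell
#!/bin/sh
set -e  # stop at the first failing command

demo=$(mktemp -d)
cd "$demo"
git init -q .

# Hypothetical project layout with small placeholder files.
mkdir -p data/raw output R
echo "id,value" > data/raw/big_survey.csv
echo "placeholder" > output/figure1.png
echo 'print("analysis")' > R/analysis.R

# Tell Git which paths to leave untracked.
cat > .gitignore <<'EOF'
# Large raw data that never changes
data/raw/
# Output that the code can regenerate
output/
EOF

# Stage everything; ignored paths are skipped.
git add .

# Only .gitignore and R/analysis.R show up as staged.
git status --short
```

The ignored files stay on disk and are simply left out of the repository's history.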

Practical ways to use: (cont.)

  2. Project continuity
  • Many researchers hold limited-term appointments.
    • Keeping documents only on personal computers doesn’t work for file transfer when people move on.
    • Easier code and data handover: stop it with the emails.
    • Assigning tasks

Practical ways to use: (cont.)

  3. Project management: useful for highly collaborative research.

Intro Resources

For R users, the best simple, straightforward resource out there is Happy Git with R.

GitHub itself has a dedicated section for learning in its docs. In particular, the Hello World tutorial will get you creating a repo, managing a branch, and merging a pull request.

Branches and pull requests

  1. Create a branch to make a change.
  2. Commit changes to the new branch.
  3. Open a pull request to merge the changes into the main branch.
  4. Optional but recommended: delete the branch once it is merged.

Source: https://www.nobledesktop.com/learn/git/git-branches
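The branch workflow above can be tried locally. This sketch makes one substitution: a pull request is a GitHub feature, so step 3 uses a plain git merge as a local stand-in for opening and merging one. The branch name fix-typo, the file notes.txt, and the user identity are placeholders.

```shell
#!/bin/sh
set -e  # stop at the first failing command

demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo User"            # placeholder identity
git config user.email "demo@example.com"

# Starting point: one commit on the main branch.
echo "Project notes" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"
git branch -M main

# 1. Create a branch to make a change.
git checkout -q -b fix-typo

# 2. Commit changes to the new branch.
echo "Fixed a typo" >> notes.txt
git commit -q -am "Fix typo in notes"

# 3. Merge the branch into main. On GitHub this is done by opening
#    and merging a pull request; git merge is the local stand-in.
git checkout -q main
git merge -q --no-ff -m "Merge branch 'fix-typo'" fix-typo

# 4. Delete the branch now that its changes are in main.
git branch -d fix-typo

# Show the resulting history as a graph.
git log --oneline --graph
```

The --no-ff flag records an explicit merge commit, which keeps the branch visible in the history graph.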

Demo